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Abstract —In this paper, we propose the problem of optimizing 
multivariate performance measures from multi-view data, and 
an effective method to solve it. This problem has two features: 
the data points are presented by multiple views, and the target 
of learning is to optimize complex multivariate performance 
measures. We propose to learn a linear discriminant functions for 
each view, and combine them to construct a overall multivariate 
mapping function for mult-view data. To learn the parameters of 
the linear discriminant functions of different views to optimize 
multivariate performance measures, we formulate a optimization 
problem. In this problem, we propose to minimize the complexity 
of the linear discriminant functions of each view, encourage the 
consistences of the responses of different views over the same data 
points, and minimize the upper boundary of a given multivariate 
performance measure. To optimize this problem, we employ the 
cutting-plane method in an iterative algorithm. In each iteration, 
we update a set of constrains, and optimize the mapping function 
parameter of each view one by one. 

Index Terms —Multivariate performance measures. Multi-view 
learning. Cutting-plane algorithm 

I. Introduction 

D ifferent multivariate performance measures are used 
to evaluate different machine learning applications m. 
Eor example, in problems of text classification, El-score and 
precision/recall breakeven point (PRBEP) are used to compare 
true class labels against predicted class labels of a given test 
text set. In image retrieval problems, the precision/recall at the 
top k returned images are used to evaluate the performance 
of a retrieval system. Recently, the problem of multivariate 
performance measures optimization are proposed to learn di¬ 
rectly to losses based on pre-debned multivariate performance 
measures. Many algorithms have been proposed to solve this 
problem, and some examples are listed as follows. 

• Joachims 12 proposed a support vector machine (SVM)- 
based for multivariate nonlinear performance measures 
optimization problems. The proposed algorithm can train 
a multivariate SVM in polynomial time for large classes 
of potentially non-linear performance measures. More¬ 
over, the traditional SAM can be arised as a special case 
of this method. 

> Li et al. Q proposed a two-step approach to opti¬ 
mize multivariate performance measures, by hrst training 
a nonlinear classihers with existing learning methods, 
and then adapting it to optimize specihc performance 
measures. In the seconde step, the classiher adaptation 
problem can be solved as a quadratic program problem, 
in a similar way to linear SVM. 

« Mao and Tsang ID proposed a novel feature selection 
method to optimize multivariate performance measures. 


by formulating the problem or high-dimensional data, and 
employing a two-layer cutting plane algorithm to solve it. 
Moreover, this method is also used in multiple-instance 
learning problems. 

Up to now, all these multivariate methods are limited to 
learn from data with single view. However, in many real- 
world machine learning problems, the data can be presented by 
multiple views. Eor example, in computer vision problems, we 
can extract different types of features, such as color features, 
texture features, and shape features, and each feature can be 
treated as a view. In scientihc article classihcation problems, 
we can also learn from the views of article abstract, content, 
and references. Different views of data may be complemental 
to each other in a learning problem and using multiple views 
has been a popular strategy in machine learning community. To 
learn from multiple views of data, different multi-view learn¬ 
ing methods have been proposed IH, fS), ||6l, 121. However, 
none of them are designed to optimized a specihc multivariate 
performance measure. 

To overcome this problem, in this paper, we propose the 
problem of learning from multiple view data to optimize 
the multivariate performance measures. Given a tuple of data 
points, each of them are presented with multiple views. The 
problem is to learn a multivariate mapping function to map 
them to a tuple of class labels, so that the multivariate perfor¬ 
mance measures can be optimized. To solve this problem, we 
proposed a novel method for multi-view learning to optimize 
multivariate performance measures. The contribution of this 
paper are of two folds: 

1) We propose the problem of multi-view learning for 
multivariate performance measures. Although there are 
plenty of multivariate performance measures optimiza¬ 
tion methods, they are limited to single view data. There 
are also lots of multi-view learning methods, however, 
none of them are proposed to optimize multivariate 
performance measures. 

2) We also propose a novel method to solve this problem. 
We proposed to learn a linear discriminant functions for 
each view, and combine them to construct a overall mul¬ 
tivariate mapping function to predict the class label tuple 
of a tuple of data points. To learn the linear discriminant 
functions parameters of different views, we formulate a 
constrained minimization problem. In this problem, we 
proposed to minimize the complexity of each linear dis¬ 
criminant functions parameter by minimizing its squared 
£2 norm, encourage the consistence among different 
views by minimizing the squared £2 norm distance 
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among different linear discriminant functions response 
of differen views, and also minimize the losses based 
on a specific multivariate performance measures. The 
minimization problem is optimized by a cutting-plane 
method in an alterative algorithm 0. We maintain a 
active set of constrains, and update it in each iteration by 
add a most violated class label tuple. We also optimize 
the multivariate mapping function parameters one by 
one, and the optimization of a multivariate mapping 
function parameter can be solve as a quadratic program 
problem. 

The rest parts of this paper are organized as follows: in 
section m we introduce the proposed method, and in section 
mil we conclude this paper. 

II. Proposed method 
A. Problem formulation 

We assume we have a training data set of n data points, 
and each data point has m views. The training set is denoted 
as {(x^ 1 ^ 1 ,where is the dj-dimensional 

feature vector of the j-th view of the i-th data point, and 
Ui € {-fl,—1} is the binary class label of the i-th data point. 
We propose to learn a multivariate mapping function h to map 
a tuple of n data points of m views, x = (xj|JT]^, • • • 
to tuple of n class labels, y = {yi, ■' ' : 2/ri)- To implement this 
multivariate mapping function, we use a linear discriminant 
function fj{x^,y') to predict the response of a view the the 
data tuple, x^ = (x{, • • • ,x^), against a candidate class label 
tuple y' = {y[,--- 


min 


”i\T=i 


- m 


(4) 


2) Encouraging consistences among different views: To 

encourage the consistences of different views, we pro¬ 
posed to minimize the squared £2 norm distances of 
responses of any two linear discriminative functions over 
one single data point. 


min - 

2 ^ 

3,3'- 3 '<3 
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iwjx^-w;x^ 
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3) Optimizing multivariate performance measures: To 

optimize a specihc multivariate performance measure, 
we propose to minimize the upper boundary of a loss 
function A based on this multivariate performance mea¬ 
sure. 


min ^ 

s.f.Vy' e y/y : 


[A!{x\y) -^{xfy')] >A(y',y)-^, 
i=i 

( 6 ) 

where ^ is a slack variable of the upper boundary of the 
loss function. 


n 

/J■(x^y') = y]]wjT'(x^y') (1) 

where Wj S is a parameter vector for the linear discrim¬ 
inant function of the j-th view, , and T'(x'^,y) is a function 
which can returns a vector to describe the match between x^ 
and y', defined as follows, 

n 

'i>{xff) = J2y'i^l- ( 2 ) 

i=l 

Based on the linear discriminant functions of different views, 
we construct the multivariate mapping function as follows. 


h{x) = arg maxy,^y I fj (x^, y') = wJT-(x^ ,y')\ 

[i=i 3=1 ) 

(3) 

where y = {+1, —1}" is a set of all admissible label vectors. 

We propose to optimize a complex multivariate performance 
measure. A, by learning the parameter vectors for 

the m views. To this end, we consider the following three 
problems, 

1) Reducing the complexity of each linear discrimi¬ 
native function: To prevent over-fitting, we propose 
to reduce the complexity of the linear discriminative 
function of each view by minimizing the squared £2 
norm of its parameter. 


The overall optimization function are obtained by combin¬ 
ing all the problems above. 


4 i: it 

3,3'- 3 '<3 V*=l 

s.f.Vy' G y/y : 


Iwjx^-wjx^ Hi 


(7) 


[T'(x^y) - T'(a:'’,y')] >A(y',y)-^. 
3=1 


In the objective of this problem, the first term is to reduce the 
complexity of the parameter vector of each view, and second 
term is a slack variable to represent the upper boundary of 
the multivariate performance measure, and the third term is to 
encourage the consistences of the responses of different views. 
Cl and C 2 are tradeoff parameters. 


B. Problem optimization 

To solve this problem, we employ the cutting-plane al¬ 
gorithm. In an iterative algorithm, we update the parameter 
vectors Wj {//Li and an active set of constrains W alternately. 
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1) Updating WjlJLi: When we have a given active set of 
constrain W C y/y, and optimize the parameter vectors, we 
optimize Wj one by one, i.e., when Wj is optimized, 
is fixed. The following optimization is obtained in this case. 


Wj,? \ z 


+f i: i: 

\i=l 

s.tMy' G W : 


Iwjxf-wjx^ 11^ 






[^{x\y) - ^{x\y')] 


>My',y)- X! ,y)-'^{x^y') 

j''-3'¥=3 

The Lagrange function of this problem is 


-C- 

( 8 ) 


To obtain the optimal points of Wj and we set the 
gradients of Lagrangian function with respect to Wj and ^ 
to zeros, and we have. 


dC 


— = riw, - /3 - ^ ayHy' = 0 


y':y'&W 


=4^ Wj = M ^ ^ aY7y> I , 

\ y':y'GW 
^ y’-.y’eW 

X! «y'=C'i- 

y'-.y’&W 


( 12 ) 


By substituting these results back to (fTTI) . we obtain the dual 
problem as 


j ^y'ly'-y'&w) 

= ^11^^112+^1^ 




uX^3 _ „,T„i II2 


\i=l 


- ay/ I wj [T'(x^ 2 /) - )] - ^(y'.y) 

y':y'€W 


+ E '^(x^' ^y) -'^(x^' ^y') +? 

3'-3'=A3 


=iwjf7wj+C'i^-wJ/3 +^ Y 

i=l 


-_E «17'(w77y' -%'+?) 

y':y'&W 


( 9 ) 


where ay' > 0 is the Lagrange multiplier for the 

constrain wj y) — T'(T'^, y')] > A{y\y) — 


^3'--3'^3 'y)-'^{x^' y') 


— and 


a={i + c, Y. H 

3 '- 3'<3 ®=1 


X'^ X'^ ''' I , 
I I I ’ 


3 3 '^ 

'4'^i wy', 


/3 = ^2 E E 

3 '- 3'<3 t=l 

Ty' = 


%/=(A(y',y)- Y w7 'S{x^',y)-'S{x^',y') 

( 10 ) 

This optimization problem can be transferred to its dual form, 
max min £{wj,^,ay'\y'.y'^w) 

°‘v'\y':y'£W (JJ) 

s . t . Vy ', y ' G W : ay > 0 . 


|4I 

(3+ Y ^v'^y' 

1 

i 

\ y':y'eW ) 

\ 


/ . xty'-jy, 
y'-Py'SiW 


E ^v'^v' 

y'-.y'&W 


S . t . Vy',y' G W : a - y / > 0, and Y^ ^ y ' = C'l- 

y'ry'GW 

( 13 ) 

We can solve this problem as a quadratic program problem. 

2) Updating W: To update W, we first find the most 
violated y' and then add it the W. y' is obtained as 


{ m 

A(y", y) + X! E ^ ’ y") 

3 = 1 

m I 


(14) 


i) Alterative algorithm: The proposed iterative algorithm 
is given in Algorithm [T] 


III. Conclusion 

In this paper, we propose the problem of multi-view learning 
for multivariate performance measures, and a novel algorithm 
to solve it. Multivariate performance measures optimization 
can obtain a predictor which directly optimize a desired 
performance measure. However, the existing methods are 
limited to single view data, and cannot utilize multi-view data. 
We propose to learn from multi-view data to optimize the 
multivariate performance measures, and it has potential in real- 
world applications. The proposed method can be implemented 
easily due to its employment of cutting plane method and 
quadratic program. 
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Algorithm 1 Iterative multi-view learning algorithm for mul¬ 
tivariate performance measures. 

Input: Multi-view ti'aining data set {(x^ |J^i,2/i)}|"=i; 
Input: Tradeoff parameters Ci and C 2 ; 

Input: The maximum iteration number T. 

Initialize linear discriminative function parameter w° for 
each view; 

W ^ 0; 

Calculate as in (fTol l: 
for f = 1, • •• ,T do 

Find the most violated class label tuple y'* as in (fT4l i by 
fixing 

Add y'* to the constrain active set W, W ■(— W U {y'*}; 

for j = 1, • • • , m do 

Update w* by solving (fOl) by fixing and W; 

end for 
end for 

Output: yvJlJLi- 
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